[SPARK-1412][SQL] Disable partial aggregation automatically when reduction factor is low - WIP #1152
@concretevitamin I find it hard to actually use config options in a physical operator. Any suggestions?

Merged build triggered.

Merged build started.

Merged build finished. All automated tests passed.

All automated tests passed.

@rxin If we are simply trying to read the default values for the params, but not user-set ones (i.e. in the absence of a

It would be great to add this into Aggregator as well. Would that replace the implementation here? I.e. does Spark SQL go through Aggregator?

Spark SQL doesn't currently use the aggregator, but we would want to do that.
Man, those are some high standards!
This avoids building up an expensive hash map if partial aggregation does not result in data size reduction.
Just a prototype. Kinda ugly, doesn't properly connect with the config system yet, and has no tests.
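For readers who want to see the idea in isolation, here is a minimal, language-neutral sketch in Python. It is not the code in this PR and not Spark's API. The function name `partial_aggregate`, the parameters `sample_size` and `min_reduction`, and the definition of the reduction factor as input rows divided by distinct groups are all assumptions made for illustration. The sketch aggregates a sample of rows into a hash map, measures how much the sample shrank, and falls back to passing rows through unchanged when the reduction is low.

```python
from typing import Callable, Dict, Hashable, Iterable, Iterator, Tuple

Row = Tuple[Hashable, int]


def partial_aggregate(
    rows: Iterable[Row],
    combine: Callable[[int, int], int],
    sample_size: int = 1000,      # hypothetical knob: rows to inspect before deciding
    min_reduction: float = 2.0,   # hypothetical knob: minimum rows-per-group to keep going
) -> Iterator[Row]:
    """Yield (key, value) pairs, pre-combining per key only while it pays off."""
    it = iter(rows)
    buffer: Dict[Hashable, int] = {}
    seen = 0

    # Phase 1: aggregate a sample of the input into the hash map.
    for key, value in it:
        buffer[key] = combine(buffer[key], value) if key in buffer else value
        seen += 1
        if seen >= sample_size:
            break

    # Assumed definition: reduction factor = input rows / distinct groups.
    if buffer and seen >= sample_size and seen / len(buffer) < min_reduction:
        # Low reduction: flush what was buffered and stop building the map.
        yield from buffer.items()
        # Pass the remaining rows through untouched.
        yield from it
        return

    # Phase 2: the map is shrinking the data, so keep aggregating.
    for key, value in it:
        buffer[key] = combine(buffer[key], value) if key in buffer else value
    yield from buffer.items()
```

In both branches the output can still contain several rows per key across partitions, so the final aggregation after the shuffle is still required. The fallback only skips the map-side hash map when it would not shrink the data.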